Data

Example structure of a “good” guide

Viral secondary structure

  • Manfredonia et al. NAR (2020)
    • SHAPE-MaP/DMS-MaPseq of in vitro refolded viral genome in ~500bp tiles
    • data: sequences of single-stranded (low Shannon entropy, high SHAPE) regions (3599 nt; 12.0% of genome)
    • per-guide metric: % of target labeled as single-stranded
  • Lan et al. bioRxiv (2020)
    • DMS-MaPseq in infected Vero E6 cells
    • data: coordinates of unstructured/structured regions (6010 nt; 20.1% of genome)
    • per-guide metric: % of target labeled as unstructured
  • Sun et al. Cell (2021)
    • icSHAPE in infected Huh7.5.1 cells & on extracted RNA
    • data: icSHAPE score [0,1] per nucleotide
    • per-guide metric: mean icSHAPE score (higher = more likely to be single-stranded)
  • Huston et al. Molecular Cell (2021)
    • SHAPE-MaP of purified RNA from infected Vero E6 cells
    • data: RNAstructure connectivity table (CT) per nucleotide (12527 positions unbound; 41.9% of genome)
    • per-guide metric: % of target labeled as unpaired
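
The per-guide metrics listed above reduce to two window summaries over a guide's target site. A minimal Python sketch, not the code used for this analysis; the function names, the 0-based half-open coordinates and the toy annotation values are assumptions made for illustration.

```python
from statistics import mean

def percent_labeled(labels, start, end):
    """Percent of target positions in [start, end) flagged True (e.g. unpaired)."""
    window = labels[start:end]
    if not window:
        raise ValueError("empty target window")
    return 100.0 * sum(window) / len(window)

def mean_score(scores, start, end):
    """Mean per-nucleotide score over [start, end), skipping missing (None) values."""
    window = [s for s in scores[start:end] if s is not None]
    return mean(window) if window else None

# Toy 10-nt annotation tracks (hypothetical values, for illustration only)
unpaired = [True, True, False, False, True, False, True, True, False, False]
icshape = [0.9, 0.8, 0.1, None, 0.7, 0.2, 0.6, 0.9, 0.0, 0.1]

pct_unpaired = percent_labeled(unpaired, 2, 7)  # target covers positions 2..6
mean_icshape = mean_score(icshape, 2, 7)
```

The percentage form applies to the label-type annotations (Manfredonia, Lan, Huston) and the mean form to the per-nucleotide icSHAPE scores (Sun).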

Correlation between annotations: over entire spacer target

  • Annotations from Manfredonia (2020) correlate poorly with other annotations
    • Only 12% of genome is labeled
    • Experimental data generated on 500bp tiling of refolded viral genome
  • Relatively high correlation between annotations generated from in vivo settings
    • Lan (2020)
    • Sun (2021)
    • Huston (2021)

Intersection of in vivo viral structure predictions

Correlation between annotations: over spacer seed region (positions 8-12 on spacer)

Exploratory data analysis: controls

Plate controls: 612_Control and No_Protein

  • Plate GS2-2 outliers: O 9-12, 21-24
  • No_Protein (+/- activator) and 612_Control (- activator) controls cluster tightly across plates
  • Variable spread of 612_Control activity with 100 fM activator

RNP-only controls

  • RNP-only control varies by plate (and presumably by guide)
  • Want to account for this background when estimating the additional gain in activator-dependent rate
  • Outliers on plate GS2-1 all correspond to NCR_1344

Data analysis

Methods:

  • Mixed linear regression to compute rate per condition across triplicates:
    • per guide: \(\text{signal}_i(t) = (\beta_0 + \beta_{0,i}) + \beta_{\text{100fM}}\cdot\mathbb{I}[\text{100fM}] + (\beta_{t} + \beta_{t,i}) \cdot t + \beta_{t:\text{100fM}}\cdot t\cdot\mathbb{I}[\text{100fM}]\)
    • RNP-only: \(\text{signal}_i(t) = (\beta_0 + \beta_{0,i}) + (\beta_t + \beta_{t,i}) \cdot t\)
    • 100fM activator: \(\text{signal}_i(t) = (\beta_0 + \beta_{0,i} + \beta_{\text{100fM}}) + (\beta_t + \beta_{t,i} + \beta_{t:\text{100fM}}) \cdot t\)
    • 100fM activator-dependent rate: \(\beta_{t:\text{100fM}}\)
  • Background subtract mean of empty wells
  • Exclude timepoints before 1000 seconds (first 10 timepoints show dip in some RNP-only controls)
  • Random effects (per 384-well, across timepoints): intercept, slope (wrt time)
  • Fixed effects: intercepts (\(\beta_0\), \(\beta_{\text{100fM}}\)), slopes (\(\beta_t\), \(\beta_{t:\text{100fM}}\))
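
To make the role of \(\beta_{t:\text{100fM}}\) concrete, here is a simplified Python sketch. It is not the mixed model described above: it drops the per-well random effects and uses ordinary least squares, in which case the time-by-activator interaction coefficient equals the difference between the slopes fitted separately to the 100 fM and RNP-only wells. The function names and the toy data are assumptions.

```python
def ols_slope(times, signals):
    """Least-squares slope of signal versus time."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two points")
    t_bar = sum(times) / n
    s_bar = sum(signals) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(times, signals))
    return sxy / sxx

def activator_dependent_rate(rnp_only, activator, empty_mean, t_min=1000.0):
    """Slope(100 fM) minus slope(RNP-only), a fixed-effects stand-in for beta_{t:100fM}.

    rnp_only / activator: lists of (time_s, signal) pooled across replicate wells.
    empty_mean: mean signal of empty wells, subtracted as background.
    t_min: timepoints before this many seconds are excluded.
    """
    def prep(points):
        kept = [(t, s - empty_mean) for t, s in points if t >= t_min]
        return [t for t, _ in kept], [s for _, s in kept]

    t0, s0 = prep(rnp_only)
    t1, s1 = prep(activator)
    return ols_slope(t1, s1) - ols_slope(t0, s0)

# Toy traces (hypothetical): an early dip at t = 0 is removed by the 1000 s cutoff
rnp = [(0, 150.0), (500, 105.0), (1000, 110.0), (1500, 115.0), (2000, 120.0)]
act = [(0, 120.0), (500, 135.0), (1000, 150.0), (1500, 165.0), (2000, 180.0)]
rate = activator_dependent_rate(rnp, act, empty_mean=50.0)
```

Subtracting a constant background does not change the slopes; it is kept here only to mirror the preprocessing steps listed above.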

Example traces:

  • dots: signal per 384-well per time
  • solid lines: regression fit

Results: 100 fM rates (above RNP-only control)

Summary of screen

  • Similar spread of rates across plates
  • Expect to see poorer rates for plates GS2-1 and GS2-2 (which harbor the “bad” guides)

Detection at last timepoint

Time to detection

Per guide:

  • Per timepoint: t-test for difference in signal between RNP-only and 100 fM conditions
  • Perform FDR correction for number of measurements from start of experiment to timepoint
  • Return first timepoint for which corrected p-value < 0.05
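
The correction and return steps can be sketched in Python as below. The sketch assumes the per-timepoint t-test p-values are already computed, and it assumes the FDR correction is Benjamini-Hochberg (the notes do not name the method). Function names and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def time_to_detection(times, pvalues, alpha=0.05):
    """First timepoint whose p-value passes after correcting for all tests so far."""
    for k in range(len(pvalues)):
        adjusted = bh_adjust(pvalues[: k + 1])  # measurements from start to timepoint k
        if adjusted[k] < alpha:
            return times[k]
    return None  # never detected

# Hypothetical p-values for one guide at five timepoints (seconds)
times = [1000, 1100, 1200, 1300, 1400]
pvals = [0.5, 0.2, 0.04, 0.001, 0.0001]
detected_at = time_to_detection(times, pvals)
```

In this toy case the raw p-value at 1200 s is below 0.05 but does not survive correction over three tests, so detection is called at 1300 s.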

Volcano plots of screening rates

  • Guides on plate GS2-2 have rates similar to those on other plates, but are much more variable (p-values closer to 1)

Rates by position along SARS-CoV-2 genome

Structures of the two guides that performed well (rate > 1 above background) but without hairpin structure (NCR_1346, NCR_1351):

Rates across groups of guide secondary structure

  • Statistically significant (positive?) correlation between guide activity and spacer structure

Determination of how much of predicted hairpin structure needs to be maintained:

##         NCR.id                       spacer
## 120   NCR_1313         GUUUACCUUGGUAAUCAUCU
## 126   NCR_1319         UCAUUAAAUGGUAGGACAGG
## 137   NCR_1330         GCAAUCAAUGGGCAAGCUUU
## 138   NCR_1331         CUUCUCUGUAGCUAGUUGUA
## 139   NCR_1332         GAGUAAAUCUUCAUAAUUAG
## 142   NCR_1335         AUGGUGUCCAGCAAUACGAA
## 143   NCR_1336         GCCGUCUUUGUUAGCACCAU
## 155   NCR_1348         AUUAGCUCUCAGGUUGUCUA
## 156   NCR_1349         UGGUACGUUAAAAGUUGAUG
## 158   NCR_1351         UGGCUACUUUGAUACAAGGU
## 21685 NCR_1410 UGAAUGUAAAACUGAGGAUCUGAAAACU
## 9671  NCR_1412 UAUAAGCAAUUGUUAUCCAGAAAGGUAC
## 10691 NCR_1417 GAUUGAGAAACCACCUGUCUCCAUUUAU
##                                                          structure
## 120           ...((((((((.........))))................))))........
## 126           ...((((((((.........))))................))))........
## 137           .......(((((....(((...((........))..))).))))).......
## 138           ..(((.(((........(((((.........)))))......))).)))...
## 139           ...............(((((((..(((......)))...)))))))......
## 142           ................((..((((((((.....))))))))..)).......
## 143           ..............((((((((((........)))..)))))))........
## 155           (((((.((((......(((.((((............))))))))))))))))
## 156           ...((((((((.........))))........)))).((((((...))))))
## 158           (((.(((((((.........))))........))))))(((((...))))).
## 21685 ((((((.((((.........))))......((.....))........)).))))......
## 9671  ...(((.((((.........))))............(((....))).........)))..
## 10691 ...............(((((((((((......................)))))).)))))

  • “Good” guides exhibit 100 fM rates similar to those of “bad” guides

Rates across guide ordering group

Background (RNP-only) rate by guide secondary structure

  • In guides that maintain crRNA hairpin: correlation between number of basepaired positions in spacer and background RNP-only rate
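
The number of base-paired spacer positions can be read off a dot-bracket string such as those printed in the table above. A minimal Python sketch; it assumes the spacer is the 3' end of the folded crRNA (the last len(spacer) characters of the structure), which is an assumption about the construct layout rather than something stated in these notes. The function name is hypothetical.

```python
def spacer_paired_count(structure, spacer):
    """Count base-paired positions within the spacer portion of a dot-bracket fold.

    Assumes the spacer occupies the last len(spacer) positions of the structure.
    """
    if len(structure) < len(spacer):
        raise ValueError("structure is shorter than the spacer")
    spacer_fold = structure[-len(spacer):]
    return sum(ch in "()" for ch in spacer_fold)

# NCR_1313, spacer and structure copied from the table above
spacer = "GUUUACCUUGGUAAUCAUCU"
structure = "...((((((((.........))))................))))........"
n_paired = spacer_paired_count(structure, spacer)
```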

Other analyses

  • Weak negative correlation between spacer GC content and guide activity
  • Weak positive correlation between % unstructured U in 30 nt downstream of protospacer in viral genome and guide activity
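
The two features named above can be computed as in this Python sketch. Several details are assumptions: the function names, the use of a boolean per-nucleotide "unpaired" track, the 0-based coordinate of the protospacer end, and the choice of denominator (all nucleotides in the window rather than only the U positions).

```python
def gc_content(spacer):
    """Fraction of G or C nucleotides in the spacer."""
    s = spacer.upper()
    return sum(nt in "GC" for nt in s) / len(s)

def downstream_unstructured_u(genome, unpaired, protospacer_end, window=30):
    """Fraction of the window after the protospacer that is an unpaired U.

    genome: target sequence (T is treated as U).
    unpaired: per-nucleotide booleans, True where the position is unstructured.
    protospacer_end: 0-based index of the first nucleotide after the protospacer.
    """
    seq = genome[protospacer_end:protospacer_end + window].upper().replace("T", "U")
    mask = unpaired[protospacer_end:protospacer_end + window]
    if not seq:
        return None
    return sum(nt == "U" and free for nt, free in zip(seq, mask)) / len(seq)

# GC content of NCR_1313 (spacer from the table above)
gc = gc_content("GUUUACCUUGGUAAUCAUCU")

# Toy target (hypothetical): 4-nt protospacer followed by a 6-nt downstream window
toy_genome = "ACGU" + "UUAGUA"
toy_unpaired = [False] * 4 + [True, False, True, True, True, False]
frac_u = downstream_unstructured_u(toy_genome, toy_unpaired, protospacer_end=4, window=6)
```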

Comparison of 20mers to 28mers

Guide activity by viral structure

  • Correlation between guide activity and viral structure as annotated by Lan (2020)
    • But no correlation with other in vivo annotations (Sun and Huston)

k-means clustering of in vivo structures (Lan, Sun, Huston)

Restriction to spacer seed region (positions 8-12 on spacer)

Position regression: guide rate

  • “Unstructured” label from Lan (2020) is most predictive
  • Also informative:
    • Base-paired-ness from Huston (2021) in position 1
    • in vitro icSHAPE score from Sun (2021) in positions 6-9
  • Spacer sequence is most informative in positions 2, 13, 17, 18

Likelihood ratio test (spacer-only vs. spacer+structure):

## Likelihood ratio test
## 
## Model 1: rate ~ spacer_1_A + spacer_1_C + spacer_1_G + spacer_1_U + spacer_2_A + 
##     spacer_2_C + spacer_2_G + spacer_2_U + spacer_3_A + spacer_3_C + 
##     spacer_3_G + spacer_3_U + spacer_4_A + spacer_4_C + spacer_4_G + 
##     spacer_4_U + spacer_5_A + spacer_5_C + spacer_5_G + spacer_5_U + 
##     spacer_6_A + spacer_6_C + spacer_6_G + spacer_6_U + spacer_7_A + 
##     spacer_7_C + spacer_7_G + spacer_7_U + spacer_8_A + spacer_8_C + 
##     spacer_8_G + spacer_8_U + spacer_9_A + spacer_9_C + spacer_9_G + 
##     spacer_9_U + spacer_10_A + spacer_10_C + spacer_10_G + spacer_10_U + 
##     spacer_11_A + spacer_11_C + spacer_11_G + spacer_11_U + spacer_12_A + 
##     spacer_12_C + spacer_12_G + spacer_12_U + spacer_13_A + spacer_13_C + 
##     spacer_13_G + spacer_13_U + spacer_14_A + spacer_14_C + spacer_14_G + 
##     spacer_14_U + spacer_15_A + spacer_15_C + spacer_15_G + spacer_15_U + 
##     spacer_16_A + spacer_16_C + spacer_16_G + spacer_16_U + spacer_17_A + 
##     spacer_17_C + spacer_17_G + spacer_17_U + spacer_18_A + spacer_18_C + 
##     spacer_18_G + spacer_18_U + spacer_19_A + spacer_19_C + spacer_19_G + 
##     spacer_19_U + spacer_20_A + spacer_20_C + spacer_20_G + spacer_20_U
## Model 2: rate ~ spacer_1_A + spacer_1_C + spacer_1_G + spacer_1_U + spacer_2_A + 
##     spacer_2_C + spacer_2_G + spacer_2_U + spacer_3_A + spacer_3_C + 
##     spacer_3_G + spacer_3_U + spacer_4_A + spacer_4_C + spacer_4_G + 
##     spacer_4_U + spacer_5_A + spacer_5_C + spacer_5_G + spacer_5_U + 
##     spacer_6_A + spacer_6_C + spacer_6_G + spacer_6_U + spacer_7_A + 
##     spacer_7_C + spacer_7_G + spacer_7_U + spacer_8_A + spacer_8_C + 
##     spacer_8_G + spacer_8_U + spacer_9_A + spacer_9_C + spacer_9_G + 
##     spacer_9_U + spacer_10_A + spacer_10_C + spacer_10_G + spacer_10_U + 
##     spacer_11_A + spacer_11_C + spacer_11_G + spacer_11_U + spacer_12_A + 
##     spacer_12_C + spacer_12_G + spacer_12_U + spacer_13_A + spacer_13_C + 
##     spacer_13_G + spacer_13_U + spacer_14_A + spacer_14_C + spacer_14_G + 
##     spacer_14_U + spacer_15_A + spacer_15_C + spacer_15_G + spacer_15_U + 
##     spacer_16_A + spacer_16_C + spacer_16_G + spacer_16_U + spacer_17_A + 
##     spacer_17_C + spacer_17_G + spacer_17_U + spacer_18_A + spacer_18_C + 
##     spacer_18_G + spacer_18_U + spacer_19_A + spacer_19_C + spacer_19_G + 
##     spacer_19_U + spacer_20_A + spacer_20_C + spacer_20_G + spacer_20_U + 
##     structure_1_. + structure_1_structured + structure_1_unstructured + 
##     structure_2_. + structure_2_both + structure_2_structured + 
##     structure_2_unstructured + structure_3_. + structure_3_both + 
##     structure_3_structured + structure_3_unstructured + structure_4_. + 
##     structure_4_both + structure_4_structured + structure_4_unstructured + 
##     structure_5_. + structure_5_both + structure_5_structured + 
##     structure_5_unstructured + structure_6_. + structure_6_both + 
##     structure_6_structured + structure_6_unstructured + structure_7_. + 
##     structure_7_both + structure_7_structured + structure_7_unstructured + 
##     structure_8_. + structure_8_both + structure_8_structured + 
##     structure_8_unstructured + structure_9_. + structure_9_both + 
##     structure_9_structured + structure_9_unstructured + structure_10_. + 
##     structure_10_both + structure_10_structured + structure_10_unstructured + 
##     structure_11_. + structure_11_both + structure_11_structured + 
##     structure_11_unstructured + structure_12_. + structure_12_both + 
##     structure_12_structured + structure_12_unstructured + structure_13_. + 
##     structure_13_both + structure_13_structured + structure_13_unstructured + 
##     structure_14_. + structure_14_both + structure_14_structured + 
##     structure_14_unstructured + structure_15_. + structure_15_both + 
##     structure_15_structured + structure_15_unstructured + structure_16_. + 
##     structure_16_both + structure_16_structured + structure_16_unstructured + 
##     structure_17_. + structure_17_both + structure_17_structured + 
##     structure_17_unstructured + structure_18_. + structure_18_both + 
##     structure_18_structured + structure_18_unstructured + structure_19_. + 
##     structure_19_both + structure_19_structured + structure_19_unstructured + 
##     structure_20_. + structure_20_both + structure_20_structured + 
##     structure_20_unstructured
##   #Df  LogLik Df  Chisq Pr(>Chisq)   
## 1  62 -792.28                        
## 2 106 -754.14 44 76.281   0.001816 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
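
As a check on the table above, the test statistic and p-value can be reproduced from the printed log-likelihoods. A short Python sketch using only the standard library; the closed-form upper tail used here holds only for an even number of degrees of freedom (44 in this case), and the function names are hypothetical.

```python
import math

def lrt_chisq(loglik_reduced, loglik_full):
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)

def chisq_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    Uses exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Values copied from the likelihood ratio test output above
stat = lrt_chisq(-792.28, -754.14)   # spacer-only vs. spacer+structure
p_value = chisq_sf_even_df(stat, 106 - 62)
```

The statistic comes out at about 76.28 on 44 degrees of freedom, in line with the printed Chisq of 76.281 (the small difference is rounding of the log-likelihoods).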

Elastic net regression on all features

Position regression: detectability at 2-hr timepoint

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

New predictive model

Model 1: sequence + structure

  • sequence: spacer sequence +/- 5 nt
  • viral structure: spacer sequence +/- 5 nt

Model 2: only sequence

  • sequence: spacer sequence +/- 5 nt

Model 3: only structure

  • viral structure: spacer sequence +/- 5 nt

Model 4: only sequence (binary)

  • sequence: spacer sequence (A/T: 0 ; C/G: 1) +/- 5 nt

Model 5: sequence (binary) + structure

  • sequence: spacer sequence (A/T: 0 ; C/G: 1) +/- 5 nt
  • viral structure: spacer sequence +/- 5 nt
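
The two sequence encodings used across Models 1-5 can be sketched in Python as follows. The feature naming follows the spacer_<position>_<nucleotide> pattern seen in the likelihood ratio test output; the function names are hypothetical, and the same encoders would be applied to the spacer plus its 5-nt flanks.

```python
def one_hot_spacer(sequence, prefix="spacer"):
    """One-hot encode each position as four 0/1 features (A, C, G, U)."""
    features = {}
    for pos, nt in enumerate(sequence.upper().replace("T", "U"), start=1):
        for base in "ACGU":
            features[f"{prefix}_{pos}_{base}"] = int(nt == base)
    return features

def binary_gc(sequence):
    """Binary encoding: A/T/U -> 0, C/G -> 1."""
    return [int(nt in "GC") for nt in sequence.upper()]

# NCR_1313 spacer from the table earlier in this report
onehot = one_hot_spacer("GUUUACCUUGGUAAUCAUCU")
binary = binary_gc("GUUUACCUUGGUAAUCAUCU")
```

The one-hot form gives 4 features per position (80 for a 20-nt spacer), while the binary form gives one feature per position and only captures GC versus AU identity.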

Model 6: rate ~ (antitag position 1) * (spacer structure) + (downstream unstructured U)

  • antitag position 1
  • spacer structure
  • downstream unstructured U

## 
## Call:
## glm(formula = Estimate ~ ., family = "gaussian", data = subset(model6_comparison_data_onehot, 
##     nchar(spacer) == 20, select = -spacer))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -36.999  -11.676   -1.541    8.462   57.903  
## 
## Coefficients: (1 not defined because of singularities)
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                27.80315    2.66084  10.449  < 2e-16 ***
## antitag_pos1_A              0.06504    3.11007   0.021   0.9833    
## antitag_pos1_C              0.14718    3.54166   0.042   0.9669    
## antitag_pos1_G            -18.74385    3.50236  -5.352 2.56e-07 ***
## antitag_pos1_U                   NA         NA      NA       NA    
## downstream_unstructured_U -17.98711   14.22614  -1.264   0.2077    
## spacer_structure            8.61017    4.99932   1.722   0.0867 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 277.927)
## 
##     Null deviance: 63703  on 190  degrees of freedom
## Residual deviance: 51416  on 185  degrees of freedom
## AIC: 1624.8
## 
## Number of Fisher Scoring iterations: 2
## 
## Call:
## glm(formula = Estimate ~ ., family = "gaussian", data = subset(model6_comparison_data_onehot, 
##     select = -spacer))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -39.018  -11.903   -1.100    8.999   54.453  
## 
## Coefficients: (1 not defined because of singularities)
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                27.7728     2.6271  10.572  < 2e-16 ***
## antitag_pos1_A             -0.1324     3.0318  -0.044   0.9652    
## antitag_pos1_C              0.4220     3.4438   0.123   0.9026    
## antitag_pos1_G            -17.6095     3.4210  -5.147 6.37e-07 ***
## antitag_pos1_U                  NA         NA      NA       NA    
## downstream_unstructured_U -15.0276    13.6651  -1.100   0.2728    
## spacer_structure           11.2954     4.8354   2.336   0.0205 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 279.9275)
## 
##     Null deviance: 67347  on 202  degrees of freedom
## Residual deviance: 55146  on 197  degrees of freedom
## AIC: 1727.8
## 
## Number of Fisher Scoring iterations: 2

## 
## Call:
## glm(formula = (Estimate > 20) ~ ., family = "binomial", data = subset(model6_comparison_data_onehot, 
##     nchar(spacer) == 20, select = -spacer))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7380  -1.3004   0.7803   0.9779   2.1603  
## 
## Coefficients: (1 not defined because of singularities)
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                0.59673    0.33701   1.771   0.0766 .  
## antitag_pos1_A            -0.10741    0.38788  -0.277   0.7818    
## antitag_pos1_C            -0.01259    0.44569  -0.028   0.9775    
## antitag_pos1_G            -2.68419    0.59822  -4.487 7.22e-06 ***
## antitag_pos1_U                  NA         NA      NA       NA    
## downstream_unstructured_U -1.44102    1.81316  -0.795   0.4268    
## spacer_structure           0.84591    0.69198   1.222   0.2215    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 264.15  on 190  degrees of freedom
## Residual deviance: 225.87  on 185  degrees of freedom
## AIC: 237.87
## 
## Number of Fisher Scoring iterations: 4
## 
## Call:
## glm(formula = (Estimate > 20) ~ ., family = "binomial", data = subset(model6_comparison_data_onehot, 
##     select = -spacer))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8540  -1.3039   0.7348   0.9664   2.0093  
## 
## Coefficients: (1 not defined because of singularities)
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                0.59537    0.33264   1.790   0.0735 .  
## antitag_pos1_A            -0.19837    0.38016  -0.522   0.6018    
## antitag_pos1_C            -0.00632    0.43958  -0.014   0.9885    
## antitag_pos1_G            -2.39508    0.52383  -4.572 4.83e-06 ***
## antitag_pos1_U                  NA         NA      NA       NA    
## downstream_unstructured_U -0.76374    1.73318  -0.441   0.6595    
## spacer_structure           1.16501    0.67537   1.725   0.0845 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 279.24  on 202  degrees of freedom
## Residual deviance: 242.82  on 197  degrees of freedom
## AIC: 254.82
## 
## Number of Fisher Scoring iterations: 4
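
To read the coefficient tables above, the following Python sketch evaluates the fitted linear predictors for the two models fit on all guides (gaussian and binomial). Coefficients are copied from the printed output; antitag_pos1_U is treated as the reference level because it is reported as NA. The function names are hypothetical, and the scales of downstream_unstructured_U and spacer_structure are not stated in these notes, so the example sets both to 0.

```python
import math

# Coefficients copied from the all-guides fits printed above
GAUSSIAN = {
    "(Intercept)": 27.7728, "antitag_pos1_A": -0.1324, "antitag_pos1_C": 0.4220,
    "antitag_pos1_G": -17.6095, "downstream_unstructured_U": -15.0276,
    "spacer_structure": 11.2954,
}
LOGISTIC = {
    "(Intercept)": 0.59537, "antitag_pos1_A": -0.19837, "antitag_pos1_C": -0.00632,
    "antitag_pos1_G": -2.39508, "downstream_unstructured_U": -0.76374,
    "spacer_structure": 1.16501,
}

def linear_predictor(coefs, antitag_pos1, downstream_unstructured_u, spacer_structure):
    """Intercept plus main effects; U at anti-tag position 1 is the reference level."""
    eta = coefs["(Intercept)"]
    eta += coefs.get(f"antitag_pos1_{antitag_pos1}", 0.0)
    eta += coefs["downstream_unstructured_U"] * downstream_unstructured_u
    eta += coefs["spacer_structure"] * spacer_structure
    return eta

def prob_rate_above_20(antitag_pos1, downstream_unstructured_u, spacer_structure):
    """Fitted probability that Estimate > 20 under the binomial model."""
    eta = linear_predictor(LOGISTIC, antitag_pos1, downstream_unstructured_u, spacer_structure)
    return 1.0 / (1.0 + math.exp(-eta))

rate_u = linear_predictor(GAUSSIAN, "U", 0.0, 0.0)   # reference anti-tag nucleotide
rate_g = linear_predictor(GAUSSIAN, "G", 0.0, 0.0)   # G at anti-tag position 1
odds_ratio_g = math.exp(LOGISTIC["antitag_pos1_G"])  # odds of rate > 20, G vs. U
```

With the other covariates at 0, the gaussian fit gives about 27.8 for U and about 10.2 for G at anti-tag position 1, and the binomial fit puts the odds of exceeding 20 for G at roughly 0.09 times those for U.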

Model 7: reduced features

  • antitag position 1
  • downstream unstructured U
  • spacer structure

Comparison to screening performed in gBlocks

## [1] "mixed model failed: NCR_1320"
## [1] "mixed model failed: NCR_1332"
## [1] "mixed model failed: NCR_1387"

gBlock round 2 outlier:

Figures for paper

Figure 1A (data): guide design pipeline

Figure 2A: range of observed guide activities

Figure 2B: example traces

Figure 2C: viral RNA v. gblock

Figure 3: elastic net regression + anti-tag result

## 
##  Welch Two Sample t-test
## 
## data:  subset(guide_rate$Estimate, guide_rate$antitag_pos1 != "G") and subset(guide_rate$Estimate, guide_rate$antitag_pos1 == "G")
## t = 7.1627, df = 66.208, p-value = 4.095e-10
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  14.64215      Inf
## sample estimates:
## mean of x mean of y 
## 27.452336  8.364669
## 
##  Welch Two Sample t-test
## 
## data:  subset(guide_rate$Estimate, guide_rate$antitag_label == "G") and subset(guide_rate$Estimate, guide_rate$antitag_label %in% c("GU", "GUU"))
## t = 2.4884, df = 29.681, p-value = 0.009337
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  2.606595      Inf
## sample estimates:
## mean of x mean of y 
##  9.916849  1.712470
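
For reference, the t statistic and fractional degrees of freedom reported by the Welch tests above come from the formulas in this Python sketch. The per-guide rates are not reproduced in these notes, so the example uses toy numbers; the function name is hypothetical. Converting t and df to the one-sided p-value needs the t distribution, which is not in the Python standard library and is left out here.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)  # squared standard error of mean(x)
    vy = variance(y) / len(y)  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Toy samples (hypothetical, not the screening data)
t_stat, dof = welch_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```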

Figure 4C: LOD with Cas13-Csm6 tandem assay

Figure 4D: robustness to genetic variants

Suppl. Figure 1A: random forest variable importance

Suppl. Figure 1B: sequence logo

Suppl. Figure 2A: GC content

Suppl. Figure 2B: hybridization MFE

Suppl. Figure 2C: cleavable U in target context

Suppl. Figure 3A: spacer structure

Suppl. Figure 3B: structure of direct repeat

Suppl. Figure 4A: in vivo viral structure

Suppl. Figure 4B: genomic structure vs. rate

Suppl. Figure 5A: multiplex set of 40 vs. primary screen

Suppl. Figure 5B: leave-one-out counterscreen

Suppl. Figure 5C: human RNA counterscreen

Suppl. Figure 6: 32-pool vs. 8-pool w/ forced mismatch

gblock rates

anti-tag complementarity

interaction between anti-tag G and spacer structure
